Towards phenotyping stroke: Leveraging data from a large-scale epidemiological study to detect stroke diagnosis
Abstract
OBJECTIVE 1) To develop a machine learning approach for detecting stroke cases and subtypes from hospitalization data; 2) to assess algorithm performance and predictors on real-world data collected by a large-scale epidemiology study in the US; and 3) to identify directions for future development of high-precision stroke phenotypic signatures.
MATERIALS AND METHODS We utilized 8,131 hospitalization events (ICD-9 codes 430-438) collected from the Greater Cincinnati/Northern Kentucky Stroke Study in 2005 and 2010. Detailed information from patients' medical records was abstracted for each event by trained research nurses. By analyzing the broad list of demographic and clinical variables, the machine learning algorithms predicted whether an event was a stroke case and, if so, the stroke subtype. The performance was validated on gold-standard labels adjudicated by stroke physicians, and results were compared with stroke classifications based on ICD-9 discharge codes, as well as labels determined by study nurses.
RESULTS The best performing machine learning algorithm achieved a performance of 88.57%/93.81%/92.80%/93.30%/89.84%/98.01% (accuracy/precision/recall/F-measure/area under ROC curve/area under precision-recall curve) on stroke case detection. For detecting stroke subtypes, the algorithm yielded an overall accuracy of 87.39% and greater than 85% precision on individual subtypes. The machine learning algorithms significantly outperformed the ICD-9 method on all measures (P < 0.001). Their performance was comparable to that of study nurses, with a better tradeoff between precision and recall. The feature selection uncovered a subset of predictive variables that could facilitate future development of effective stroke phenotyping algorithms.
DISCUSSION AND CONCLUSIONS By analyzing a broad array of patient data, the machine learning technologies held promise for improving detection of stroke diagnosis, thus unlocking high statistical power for subsequent genetic and genomic studies.
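The F-measure reported in the results can be related to the reported precision and recall with a short calculation. The sketch below assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state which variant was used, and the function and variable names are illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-measure" is F1; the source does not say so.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for stroke case detection, in percent.
reported_precision = 93.81
reported_recall = 92.80
reported_f = 93.30

computed_f = f_measure(reported_precision, reported_recall)
print(f"computed F-measure: {computed_f:.2f}% (reported: {reported_f:.2f}%)")
```

Under that assumption, the value computed from the reported precision and recall agrees with the reported F-measure to two decimal places.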
Similar resources
Different Stroke Scales; Which Scale or Scales Should Be Use?
Background: There has been considerable development in the clinimetrics of stroke. However, researchers are concerned that some scales are inherently too generic and may not provide insight. The current study was conducted to determine which scale or scales should be used in stroke survivors. Methods: We selected 67 studies published between January 2010 and December 2018 from Up to d...
Design and psychometric properties of Iranian pre-hospital stroke scale
Background: Studies have shown that stroke morbidity and mortality could be decreased if early diagnosis and treatment are delivered faster to patients. This tool is designed based on all pre-hospital stroke scales across the world as well as the experiences of emergency medicine specialists and pre-hospital emergency technicians to improve the diagnostic accuracy of the stroke scale in I...
Anxiety and Depression Symptoms in Post Stroke Outpatients
Abstract Introduction: Given the high prevalence of stroke, it seems necessary to investigate and identify the factors that increase morbidity and mortality following stroke. Among these factors are mental disorders after stroke which lead to increased morbidity and mortality, independently of other risk factors and the severity of the disease. Objective: To investigate the frequency of depre...
Stroke mimics in patients with clinical signs of stroke
Background: Stroke mimics are a major diagnostic challenge and may be difficult to distinguish from real strokes. The aim of this study was to evaluate the relative frequency of stroke mimics in patients with clinical signs of stroke. Methods: In this cross-sectional study, the medical records of 1985 patients with stroke admitted to Poursina Hospital were enrolled using the census technique. ...
Evaluation of Homocysteine Level as a Risk Factor among Patients with Ischemic Stroke and Its Subtypes
Background: Epidemiological research has shown that increased total homocysteine (tHcy) levels are associated with an increased risk of thromboembolic disease; however, controversy still exists over which subtype of stroke is allied to hyperhomocysteinemia. This study aimed to investigate whether elevated tHcy is an independent risk factor for ischemic stroke and to compare tHcy levels in patie...
Journal title:
Volume 13, Issue
Pages -
Publication date 2018